INTRODUCTION


The development of a theory of statistical tests, as distinct from a
collection of special examples, may be said to have begun with the
introduction of the notion of types of error by Neyman and Pearson in 1928
and 1933 1.2/. Correspondingly, the theoretical attack on the
classification problem may be said to have begun when the Neyman-Pearson
ideas were adapted to the discriminant function by Welch in 1939 £/.

Welch considered only the problem of classifying an individual into one of
two populations, say $\pi_1$ and $\pi_2$, and further restricted the problem
by assuming that the probability density function of the measured quantities
is completely known within each of the populations. Let
$f_1(x_1, x_2, \ldots, x_p)$ denote the probability density of the observable
quantities $X_1, X_2, \ldots, X_p$ in $\pi_1$ and let
$f_2(x_1, x_2, \ldots, x_p)$ be the corresponding density in $\pi_2$. Welch
2/ observed that many methods of classifying an individual Z into one of the
two populations on the basis of the observations on $X_1, X_2, \ldots, X_p$
amount to a partition of the p-dimensional sample space of the X's into two
regions, say $R_1$ and $R_2$, with the rule that Z will be assigned to
$\pi_1$ if the random point with coordinates $(X_1, \ldots, X_p)$ falls in
$R_1$ and will be assigned to $\pi_2$ if $(X_1, X_2, \ldots, X_p)$ falls in
$R_2$. The choice of a rule for classification or discrimination is thus
equivalent to the choice of a partitioning of the sample space into the
regions $R_1$ and $R_2$.

A discriminant function may be used for the purpose of classification in the
case of two multivariate normal distributions with different mean vectors
$\mu_1$ and $\mu_2$ and a common covariance matrix $\Sigma$ 4/. We shall
consider the classification problem in the case of $\mu_1 \ne \mu_2$ and
$\Sigma_1 \ne \Sigma_2$, where $\mu_1$ and $\Sigma_1$ are from $\pi_1$ and
$\mu_2$ and $\Sigma_2$ are from $\pi_2$.

Suppose vectors $x$ with p components are distributed in accordance with
$N(\mu_1, \Sigma_1)$ and $N(\mu_2, \Sigma_2)$. The density function of the
ith (i = 1, 2) distribution is

(1)    $n(x \mid \mu_i, \Sigma_i) = (2\pi)^{-p/2} |\Sigma_i|^{-1/2} \exp[-\tfrac{1}{2}(x - \mu_i)'\Sigma_i^{-1}(x - \mu_i)]$

The best procedure for discrimination will be based on the likelihood ratio
statistic, say L. An individual is classified into $\pi_1$ if L is less than
a constant, say c, and into $\pi_2$ if $L \ge c$. If $\Sigma_1 = \Sigma_2$,
then L is a linear function of $x$.

In the p-variate case, consider $b'\Sigma b = c$, which is an ellipsoid,
where $b$ is a vector with p components; the shape and rotation of the
ellipsoid are determined by $\Sigma$, and the size is determined by c.
Consider two populations centered around different points and with different
patterns of scatter. Under such a circumstance, the simplest procedure for
classification will be a line or hyperplane. The following discussion will be
restricted to a linear procedure.

Now, let $b \ne 0$ be a vector of p components, and c be a scalar. An
observation $x$ is classified as from $\pi_1$ if $b'x < c$ and as from
$\pi_2$ if $b'x \ge c$.

Suppose $\mu_1 \ne \mu_2$ and $\Sigma_1 \ne \Sigma_2$ and further assume that
$x$ is sampled from the ith (i = 1, 2) population; then $b'x$ is normally
distributed with mean $b'\mu_i$ and variance $b'\Sigma_i b$.

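The probability of misclassifying an observation when it is from $\pi_1$ is

       $P_1 = 1 - \Phi(y_1)$

where

       $\Phi(y_1) = \int_{-\infty}^{y_1} n(z \mid 0, 1)\,dz$

and the probability of misclassifying an observation when it comes from
$\pi_2$ is

       $P_2 = 1 - \Phi(y_2)$

where

       $\Phi(y_2) = \int_{-\infty}^{y_2} n(z \mid 0, 1)\,dz$

with $n(z \mid 0, 1)$ the standard normal density, and where

(2)    $y_1 = (c - b'\mu_1)/(b'\Sigma_1 b)^{1/2}$,   $y_2 = (b'\mu_2 - c)/(b'\Sigma_2 b)^{1/2}$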
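
As a rough illustration of how (2) turns a linear rule into the two error
probabilities, the following Python sketch evaluates $y_1$, $y_2$, $P_1$ and
$P_2$ for a given vector b and scalar c. The function names and the
two-dimensional example values are assumptions made for illustration only;
they do not come from the report.

```python
import math
import numpy as np


def std_normal_cdf(y):
    """Standard normal distribution function, written with the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def misclassification_probs(b, c, mu1, sigma1, mu2, sigma2):
    """Return (y1, y2, P1, P2) for the rule 'assign to population 1 if b'x < c'."""
    b = np.asarray(b, dtype=float)
    y1 = (c - b @ mu1) / math.sqrt(b @ sigma1 @ b)
    y2 = (b @ mu2 - c) / math.sqrt(b @ sigma2 @ b)
    return y1, y2, 1.0 - std_normal_cdf(y1), 1.0 - std_normal_cdf(y2)


# Hypothetical example values (not taken from the report).
mu1 = np.array([0.0, 0.0])
mu2 = np.array([2.0, 0.0])
sigma1 = np.eye(2)
sigma2 = np.array([[4.0, 0.0], [0.0, 1.0]])

y1, y2, p1, p2 = misclassification_probs([1.0, 0.0], 1.0, mu1, sigma1, mu2, sigma2)
print(f"y1={y1:.3f}  y2={y2:.3f}  P1={p1:.4f}  P2={p2:.4f}")
```

Larger values of $y_1$ and $y_2$ give smaller error probabilities, which is
why the discussion is carried out in terms of the y's.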


It is desired to minimize $P_1$ and/or $P_2$, or equivalently to maximize
$y_1$ and/or $y_2$. Some specific problems are to find

1)  a procedure that minimizes the probability of one error of
    classification when the other is specified;
2)  the procedure that minimizes the maximum probability of error.

The above solutions will be found within the set of admissible linear
procedures £/.

ADMISSIBLE  PROCEDURE 

Each procedure in terms of the two probabilities of misclassification can be
put into the framework of the general decision problem 6/. There are two
possible decisions $D_1$ and $D_2$. The appropriate decision depends on the
values of $y_1$ and $y_2$, which are elements of the sample space R. R can
be decomposed into two subspaces $R_1$ and $R_2$ so that decision $D_1$ is
preferred if $y_1$ belongs to $R_1$ and $D_2$ is preferred if $y_2$ belongs
to $R_2$. The probability of misclassification associated with D and y is
given by the loss for misclassification $a(D; y)$, where

       if $y_1$ belongs to $R_1$, then $a(D_1; y_1) = 0$

       if $y_2$ belongs to $R_2$, then $a(D_2; y_2) = 0$

An attempt is made to minimize $a(D; y)$ ($\ge 0$).


Definition. A procedure is admissible if there is no other procedure which
is better.

In the present case, one linear procedure is better than another if the
$a(D; y)$ obtained for that procedure is at least as small as the
corresponding $a(D; y)$ for the other procedure.

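Consider the $(y_1, y_2)$-plane. From (2), we obtain

       $y_1 (b'\Sigma_1 b)^{1/2} = c - b'\mu_1$
       $y_2 (b'\Sigma_2 b)^{1/2} = b'\mu_2 - c$

Solving for $y_1$,

(3)    $y_1 = [b'd - y_2 (b'\Sigma_2 b)^{1/2}] / (b'\Sigma_1 b)^{1/2}$

where $d = \mu_2 - \mu_1$.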

For a given $y_2$, $y_1$ is a function of $b$, i.e., $y_1 = f(b)$. From (3),
$y_1$ is continuous everywhere in $b$, except at $b = 0$; and $y_1$ is
homogeneous in $b$ of degree zero, i.e., $y_1$ is invariant under
multiplication of $b$ by a positive scalar. We can, therefore, restrict $b$
to lie on an ellipse, say $b'\Sigma b = k$, which is closed and bounded. On
this set $y_1$ is continuous and has a maximum. Thus, for a given $y_2$,
$y_1$ has a maximum.

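Differentiating (3) with respect to $y_2$, we get

       $\partial y_1/\partial y_2 = -(b'\Sigma_2 b)^{1/2} / (b'\Sigma_1 b)^{1/2}$

which is a negative quantity; therefore $y_1$ is a decreasing function of
$y_2$. From (3), with $d = \mu_2 - \mu_1$, setting the derivative of $y_1$
with respect to $b$ equal to zero gives

       $[d - y_2 (b'\Sigma_2 b)^{-1/2} \Sigma_2 b](b'\Sigma_1 b)^{-1/2} - [b'd - y_2 (b'\Sigma_2 b)^{1/2}](b'\Sigma_1 b)^{-3/2} \Sigma_1 b = 0.$

Multiplying by $(b'\Sigma_1 b)^{1/2}$ and using (3),

       $d - y_2 (b'\Sigma_2 b)^{-1/2} \Sigma_2 b - y_1 (b'\Sigma_1 b)^{-1/2} \Sigma_1 b = 0$

or

       $d = (t_1 \Sigma_1 + t_2 \Sigma_2) b$

where

       $t_1 = y_1 / (b'\Sigma_1 b)^{1/2}$,   $t_2 = y_2 / (b'\Sigma_2 b)^{1/2}$,

so that

(4)    $b = (t_1 \Sigma_1 + t_2 \Sigma_2)^{-1} d$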

Notice that a point $(y_1, y_2)$, with $y_1$ maximized with respect to $b$
for a given $y_2$, corresponds to an admissible procedure (this will be
proved later). By definition, for a given point on the ellipse
$b'\Sigma b = k$, there exists one and only one line through the point
$(y_1, y_2)$ which is the tangent to $b'\Sigma b = k$. This implies that
$y_1$ has one and only one maximum with respect to $b$ for a given $y_2$.

From (2) and (3), with $d = \mu_2 - \mu_1$,

(5)    $c = b'd - y_2 (b'\Sigma_2 b)^{1/2} + b'\mu_1$

or     $c = b'\mu_2 - y_2 (b'\Sigma_2 b)^{1/2}$
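
The following Python sketch shows one way (4) and (5) might be evaluated
numerically for chosen weights $t_1$ and $t_2$, using the relations
$t_i = y_i/(b'\Sigma_i b)^{1/2}$ from the derivation of (4). The function
name and the example values are hypothetical and serve only to illustrate
the formulas.

```python
import numpy as np


def admissible_procedure(t1, t2, mu1, sigma1, mu2, sigma2):
    """Return (b, c, y1, y2); assumes t1*sigma1 + t2*sigma2 is positive definite."""
    d = mu2 - mu1
    b = np.linalg.solve(t1 * sigma1 + t2 * sigma2, d)  # (4): b = (t1*S1 + t2*S2)^{-1} d
    s1 = np.sqrt(b @ sigma1 @ b)
    s2 = np.sqrt(b @ sigma2 @ b)
    y1, y2 = t1 * s1, t2 * s2                          # y_i = t_i (b' S_i b)^{1/2}
    c = b @ mu2 - y2 * s2                              # (5): c = b'mu2 - y2 (b' S2 b)^{1/2}
    return b, c, y1, y2


# Hypothetical example values (not taken from the report).
mu1 = np.array([0.0, 0.0])
mu2 = np.array([2.0, 0.0])
sigma1 = np.eye(2)
sigma2 = np.array([[4.0, 0.0], [0.0, 1.0]])

b, c, y1, y2 = admissible_procedure(0.5, 0.5, mu1, sigma1, mu2, sigma2)

# Consistency check: y1 recomputed from its definition in (2).
y1_check = (c - b @ mu1) / np.sqrt(b @ sigma1 @ b)
print(b, c, y1, y2, y1_check)
```

The check at the end confirms, for this example, that the c obtained from
(5) reproduces the same $y_1$ as the definition in (2).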


Theorem 1. If a point $(g_1, g_2)$ with $g_1 > 0$, $g_2 > 0$ is admissible,
there exist $t_1 > 0$, $t_2 > 0$ such that the procedure is defined by (4)
and (5) JJ/.

Proof: Let the admissible procedure be defined by the vector $a$ and scalar
$e$. The line

(6)    $y_1 = (e - a'\mu_1)/(a'\Sigma_1 a)^{1/2}$,   $y_2 = (a'\mu_2 - e)/(a'\Sigma_2 a)^{1/2}$

with $e$ as parameter has the point $(g_1, g_2)$ on it. Eliminating $e$, we
obtain, with $d = \mu_2 - \mu_1$,

       $y_1 = [a'd - y_2 (a'\Sigma_2 a)^{1/2}] / (a'\Sigma_1 a)^{1/2}$

(6')   $dy_1/dy_2 = -(a'\Sigma_2 a)^{1/2} / (a'\Sigma_1 a)^{1/2}$

and since both $\Sigma_1$ and $\Sigma_2$ are positive definite, $dy_1/dy_2$
is negative.

Hence, there exist positive numbers $t_1$, $t_2$ and k such that the line

       $y_1 = [a'd - y_2 (a'\Sigma_2 a)^{1/2}] / (a'\Sigma_1 a)^{1/2}$

is tangent to the ellipse

(7)    $y_1^2/t_1 + y_2^2/t_2 = k$

at the point $(g_1, g_2)$. The slope of the line tangent to the ellipse at a
point $(y_1^0, y_2^0)$ is

       $dy_1/dy_2 = -(y_2^0 t_1)/(y_1^0 t_2)$


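Consider a line defined by

(8)    $y_1 = (c - b'\mu_1)/(b'\Sigma_1 b)^{1/2}$,   $y_2 = (b'\mu_2 - c)/(b'\Sigma_2 b)^{1/2}$

where $b$ is a coefficient vector and c is a scalar which may be so chosen
that the line is the tangent to (7) at the point $(y_1^0, y_2^0)$.

From (7),

       $(y_1/t_1)\,dy_1/dy_2 + y_2/t_2 = 0$

and, using (8),

(6'')  $\frac{c - b'\mu_1}{t_1 (b'\Sigma_1 b)^{1/2}} \frac{dy_1}{dy_2} + \frac{b'\mu_2 - c}{t_2 (b'\Sigma_2 b)^{1/2}} = 0$

Since along (8) $dy_1/dy_2 = -(b'\Sigma_2 b)^{1/2}/(b'\Sigma_1 b)^{1/2}$,
solving for c gives

(9)    $c = \frac{t_1 (b'\Sigma_1 b)(b'\mu_2) + t_2 (b'\Sigma_2 b)(b'\mu_1)}{b'(t_1 \Sigma_1 + t_2 \Sigma_2) b}$

Substituting (9) into (8), we obtain, with $d = \mu_2 - \mu_1$,

(10)   $y_1^0 = \frac{t_1 (b'\Sigma_1 b)^{1/2} (b'd)}{b'(t_1 \Sigma_1 + t_2 \Sigma_2) b}$,   $y_2^0 = \frac{t_2 (b'\Sigma_2 b)^{1/2} (b'd)}{b'(t_1 \Sigma_1 + t_2 \Sigma_2) b}$

Substituting (10) into (7), we get

(11)   $\frac{[t_1 (b'\Sigma_1 b) + t_2 (b'\Sigma_2 b)](b'd)^2}{[b'(t_1 \Sigma_1 + t_2 \Sigma_2) b]^2} = k$

Therefore, the point $(y_1^0, y_2^0)$ is on the ellipse with a constant

(12)   $k = \frac{(y_1^0)^2}{t_1} + \frac{(y_2^0)^2}{t_2} = \frac{(b'd)^2}{b'(t_1 \Sigma_1 + t_2 \Sigma_2) b}$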


The right hand side of (12) is maximized with respect to $b$ where $b$ was
given in (4). This procedure must be admissible. For, if there were a vector
$b$ such that (12) was larger than k, the point $(g_1, g_2)$ would be within
the ellipse with constant (11) (Fig. 1). This contradicts the fact that the
procedure is admissible, for $(g_1, g_2)$ would be nearer the origin than
the tangent at $(y_1^0, y_2^0)$; then some points on this tangent line
(corresponding to procedures with vector $b$ and scalar c) would be better.
This proves the assertion.

Fig. 1

Since $\Sigma_1$ and $\Sigma_2$ are positive definite, and $t_1 > 0$,
$t_2 > 0$, $t_1 \Sigma_1 + t_2 \Sigma_2$ is also positive definite. The right
hand side of (12) is homogeneous of degree zero in $b$; therefore the value
of k is invariant under any scalar transformation of $b$.

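When $b$ is given by (4),

       $d = (t_1 \Sigma_1 + t_2 \Sigma_2) b$

       $d'b = b'(t_1 \Sigma_1 + t_2 \Sigma_2) b$

       $b'd = b'(t_1 \Sigma_1 + t_2 \Sigma_2) b$

Then (10) reduces to

(13)   $y_1 = t_1 (b'\Sigma_1 b)^{1/2}$,   $y_2 = t_2 (b'\Sigma_2 b)^{1/2}$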


For simplicity, we normalize $t_1 + t_2 = 1$ if $t_1, t_2 > 0$;
$t_1 - t_2 = 1$ if $t_1 > 0$, $t_2 < 0$; and $t_2 - t_1 = 1$ when $t_1 < 0$,
$t_2 > 0$.

Lemma: Any positive definite symmetric matrix A can be expressed as the
product of a non-singular matrix and its transpose. Conversely, every such
product is a positive definite symmetric matrix J/.

Proof: If A is n x n and positive definite, then the rank of A is n. This
means that the congruent canonical form of A is I. Therefore, there exists a
non-singular matrix P such that

       $P'AP = I$

       $A = (P')^{-1} P^{-1}$

         $= (P^{-1})'(P^{-1})$

         $= Q'Q$

where $Q = P^{-1}$. Conversely, if Q is non-singular, then $Q'Q$ is
non-singular. This implies that $Q'Q = Q'IQ$ is congruent to
$I = (\delta_{ij})$, where $\delta_{ij}$ is the Kronecker delta. Moreover,

       $(Q'Q)' = Q'Q$

implies that $Q'Q$ is symmetric. Therefore, $Q'Q$ is positive definite.
(Q.E.D.)
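
As a numerical companion to the lemma, the sketch below uses the Cholesky
factorization, which is one particular way of producing a non-singular Q
with $A = Q'Q$; the proof above proceeds by congruence reduction instead, so
this is a swapped-in technique rather than the report's construction. The
function names and matrices are hypothetical.

```python
import numpy as np


def factor_positive_definite(a):
    """Return a non-singular Q with a = Q'Q (Q is the transposed Cholesky factor)."""
    lower = np.linalg.cholesky(a)  # a = lower @ lower.T
    return lower.T


def is_symmetric_positive_definite(m):
    """Check symmetry and strictly positive eigenvalues."""
    return bool(np.allclose(m, m.T) and np.all(np.linalg.eigvalsh(m) > 0.0))


# Direct part: factor a hypothetical positive definite matrix.
a = np.array([[4.0, 2.0], [2.0, 3.0]])
q = factor_positive_definite(a)
print(np.allclose(q.T @ q, a))

# Converse part: Q'Q built from a hypothetical non-singular Q.
q2 = np.array([[1.0, 2.0], [0.0, 3.0]])
product = q2.T @ q2
print(is_symmetric_positive_definite(product))
```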

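We know that $\Sigma_1$ and $\Sigma_2$ are positive definite. By the lemma,
there exists a non-singular matrix P such that

       $\Sigma_2 = N'N$

where $N = P^{-1}$, that is, $P'\Sigma_2 P = I$. And there exists an
orthogonal matrix R such that

       $R'(P'\Sigma_1 P)R = \Lambda$

where

       $\Lambda = (\lambda_i \delta_{ij})$,   $i, j = 1, \ldots, p$.

Notice that

       $I = R'R$

implies that

       $(PR)'\Sigma_2 (PR) = R'(P'\Sigma_2 P)R = I$.

Define PR as the new P, so that with $N = P^{-1}$ we have
$\Sigma_2 = N'N$ and $\Sigma_1 = N'\Lambda N$, and let $d^* = (N')^{-1} d$.

From (13),

(14)   $y_1 = t_1 (b'\Sigma_1 b)^{1/2}$

           $= t_1 \{[(t_1 N'\Lambda N + t_2 N'N)^{-1} d]' N'\Lambda N [(t_1 N'\Lambda N + t_2 N'N)^{-1} d]\}^{1/2}$

           $= t_1 [d^{*\prime} (t_1 \Lambda + t_2 I)^{-1} \Lambda (t_1 \Lambda + t_2 I)^{-1} d^*]^{1/2}$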


For the same reason,

(15)   $y_2 = t_2 [d^{*\prime} (t_1 \Lambda + t_2 I)^{-1} (t_1 \Lambda + t_2 I)^{-1} d^*]^{1/2}$

where $d^* = (N')^{-1} d$ has components $d_j^*$. From (14), with
$t_2 = 1 - t_1$,

(16)   $\partial y_1^2 / \partial t_1 = 2 t_1 \sum_{j=1}^{p} \lambda_j (d_j^*)^2 / (t_1 \lambda_j + t_2)^3$

The quantity is positive, since $t_1, t_2 > 0$; this means that $y_1^2$ is an
increasing function of $t_1$. The same argument shows that $y_2^2$ is a
decreasing function of $t_1$, i.e.,

(17)   $\partial y_2^2 / \partial t_1 = -2 t_2 \sum_{j=1}^{p} \lambda_j (d_j^*)^2 / (t_1 \lambda_j + t_2)^3 < 0$

Theorem 2. The procedure defined by $b = (t_1 \Sigma_1 + t_2 \Sigma_2)^{-1} d$
and c given by (5), for any $t_1$ and $t_2$ such that
$t_1 \Sigma_1 + t_2 \Sigma_2$ is positive definite, is admissible JJ/.

Proof:

i)    $t_1 + t_2 = 1$, i.e., let $t_1 > 0$, $t_2 > 0$

If the procedure defined by $b = (t_1 \Sigma_1 + t_2 \Sigma_2)^{-1} d$ were
not admissible, there would be an admissible procedure which would, by
Theorem 1, be defined by $a = (\tau_1 \Sigma_1 + \tau_2 \Sigma_2)^{-1} d$ for
some $\tau_1 + \tau_2 = 1$. (16) and (17) indicate that $y_1^2$ and $y_2^2$
are increasing and decreasing functions of $t_1$, respectively; one of the
coordinates corresponding to $\tau_1$ would have to be less than one of the
coordinates corresponding to $t_1$. We can conclude that the procedure
defined by $a = (\tau_1 \Sigma_1 + \tau_2 \Sigma_2)^{-1} d$ is not better
than the procedure originally defined by $b$, and this contradicts the
assumption. Therefore, the original procedure is admissible.


ii)   $t_1 = 0$, $t_2 = 1$

If $t_1 = 0$, from (13), $y_1 = 0$ and $y_2 = t_2 (b'\Sigma_2 b)^{1/2}$. But
from (3)

       $y_1 = [b'd - y_2 (b'\Sigma_2 b)^{1/2}] / (b'\Sigma_1 b)^{1/2}$;

since $y_1 = 0$, we get

       $y_2 = b'd / (b'\Sigma_2 b)^{1/2}$.

With $t_1 = 0$, $t_2 = 1$, (4) gives $b = \Sigma_2^{-1} d$. This is a form
for $b$ which maximizes $y_2$. Therefore, the procedure is admissible. In the
case of $t_2 = 0$, $t_1 = 1$, the result follows by an argument similar to
the above.

Remark: When $y_2$ is maximized with respect to $b$, we obtain
$b = (t_1 \Sigma_1 + t_2 \Sigma_2)^{-1} d$. In the case (ii), $t_1 = 0$,
$t_2 = 1$.

iii)  $t_1 > 0$ and $t_2 < 0$, i.e., let $t_1 - t_2 = 1$

Consider

(18)   $y_1^2/t_1 + y_2^2/t_2 = k$

which, since $t_2 < 0$, is a hyperbola which cuts the $y_1$-axis at
$\pm (t_1 k)^{1/2}$. We are interested in the right hand branch. From (13),
$y_1 > 0$ and $y_2 < 0$. Substituting (2) into (18), we get

(19)   $\frac{(c - b'\mu_1)^2}{t_1\, b'\Sigma_1 b} + \frac{(b'\mu_2 - c)^2}{t_2\, b'\Sigma_2 b} = k$

As we have shown, the maximum value of (19) with respect to c for given $b$
has exactly the same form as given in (5) and (9). The maximum of (19) is
given by $b = (t_1 \Sigma_1 + t_2 \Sigma_2)^{-1} d$. By Theorem 1, it follows
that the procedure is admissible. We note that the case of $t_1 < 0$,
$t_2 > 0$ is similar to the case proved above.




USE OF ADMISSIBLE PROCEDURES

i)    Minimization of one probability of misclassification for a
specified probability of the other £/.

For given $y_2 > 0$, i.e., for a given probability of misclassification when
sampling from $\pi_2$, we want to maximize $y_1$, i.e., we want to minimize
the probability of misclassification when sampling from $\pi_1$.

a)    $t_1 + t_2 = 1$

If the maximum $y_1 \ge 0$, we want to find $t_2 = 1 - t_1$ so that
$y_2 = t_2 (b'\Sigma_2 b)^{1/2}$, where
$b = (t_1 \Sigma_1 + t_2 \Sigma_2)^{-1} d$. The solution can be approximated
by trial and error, since $y_2$ is an increasing function of $t_2$: $t_2$ is
adjusted until $t_2 (b'\Sigma_2 b)^{1/2}$ agrees closely with the desired
$y_2$.

b)    $t_2 - t_1 = 1$

$y_2$ is a decreasing function of $t_2$ ($t_2 \le 1$), and
$y_2 = (d'\Sigma_2^{-1} d)^{1/2}$ when $t_2 = 1$. If the given
$y_2 > (d'\Sigma_2^{-1} d)^{1/2}$, then $y_1 < 0$, and we search for a value
of $t_2$ so that the given $y_2 = t_2 (b'\Sigma_2 b)^{1/2}$.
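
The trial-and-error search in case a) can be sketched as a bisection on
$t_2$, which relies only on the stated fact that $y_2$ increases with $t_2$
when $t_1 + t_2 = 1$. Bisection is an assumption of this sketch (the report
does not prescribe a particular search), and the function names and example
values are hypothetical.

```python
import numpy as np


def y_values(t1, t2, d, sigma1, sigma2):
    """Return (y1, y2) for b = (t1*sigma1 + t2*sigma2)^{-1} d."""
    b = np.linalg.solve(t1 * sigma1 + t2 * sigma2, d)
    return t1 * np.sqrt(b @ sigma1 @ b), t2 * np.sqrt(b @ sigma2 @ b)


def find_t2(target_y2, d, sigma1, sigma2, tol=1e-12):
    """Bisection on t2 in (0, 1) with t1 = 1 - t2.

    Assumes 0 < target_y2 < (d' sigma2^{-1} d)^{1/2}, so a root exists.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        _, y2 = y_values(1.0 - mid, mid, d, sigma1, sigma2)
        if y2 < target_y2:
            lo = mid  # y2 too small: increase t2
        else:
            hi = mid  # y2 too large: decrease t2
    return 0.5 * (lo + hi)


# Hypothetical example values (not taken from the report).
d = np.array([2.0, 0.0])
sigma1 = np.eye(2)
sigma2 = np.array([[4.0, 0.0], [0.0, 1.0]])

t2 = find_t2(0.8, d, sigma1, sigma2)
y1, y2 = y_values(1.0 - t2, t2, d, sigma1, sigma2)
print(t2, y1, y2)
```

For case b) the same idea would apply with $t_1 = t_2 - 1$ and the search
direction reversed, since $y_2$ is then decreasing in $t_2$.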

ii)   The minimax procedure Sj.

This procedure involves finding $b$ and t to satisfy

(20)   $[t \Sigma_1 + (1 - t) \Sigma_2] b = d$

(21)   $b'[t^2 \Sigma_1 - (1 - t)^2 \Sigma_2] b = 0$,   $0 < t < 1$.

Condition (21) may be written as

       $b'(\Sigma_1 - v \Sigma_2) b = 0$

where $v = [(1 - t)/t]^2$.

There is a non-singular matrix N such that $\Sigma_2 = N'N$ and
$\Sigma_1 = N'\Lambda N$, where $\Lambda$ is a diagonal matrix with
$\lambda_j$ (j = 1, 2, ..., p) the roots of
$|\Sigma_1 - \lambda \Sigma_2| = 0$. On substitution of these values in (21)
the quadratic form reduces to

       $(b^*)'(\Lambda - vI)(b^*) = 0$,

where $b^* = Nb$, which is equivalent to

       $\sum_{j=1}^{p} (\lambda_j - v)(b_j^*)^2 = 0$.

This will not have a non-zero solution for $b^*$ if the factors
$(\lambda_j - v)$ are all positive or are all negative. This means that, for
the minimax solution, v will have to lie between the largest and the
smallest roots of $|\Sigma_1 - \lambda \Sigma_2| = 0$.
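
A minimal sketch of how (20) and (21) might be solved numerically is given
below. It assumes that, with $b$ obtained from (20), condition (21) amounts
to $y_1 = y_2$ where $y_1 = t (b'\Sigma_1 b)^{1/2}$ and
$y_2 = (1 - t)(b'\Sigma_2 b)^{1/2}$, and it locates t by bisection. The
bisection, the function names and the example values are assumptions of this
illustration, not the report's stated computing method.

```python
import numpy as np


def minimax_procedure(mu1, sigma1, mu2, sigma2, tol=1e-13):
    """Return (t, b, c, y) with y1 = y2 = y, found by bisection on t in (0, 1)."""
    d = mu2 - mu1

    def gap(t):
        b = np.linalg.solve(t * sigma1 + (1.0 - t) * sigma2, d)  # (20)
        y1 = t * np.sqrt(b @ sigma1 @ b)
        y2 = (1.0 - t) * np.sqrt(b @ sigma2 @ b)
        return y1 - y2, b, y1

    lo, hi = 0.0, 1.0  # the gap is negative near t = 0 and positive near t = 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid)[0] < 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    _, b, y = gap(t)
    c = b @ mu1 + y * np.sqrt(b @ sigma1 @ b)  # from y1 = (c - b'mu1)/(b'S1 b)^{1/2}
    return t, b, c, y


def roots_of_pencil(sigma1, sigma2):
    """Roots of |sigma1 - lambda*sigma2| = 0 via a Cholesky reduction of sigma2."""
    lower_inv = np.linalg.inv(np.linalg.cholesky(sigma2))
    return np.linalg.eigvalsh(lower_inv @ sigma1 @ lower_inv.T)


# Hypothetical example values (not taken from the report).
mu1 = np.array([0.0, 0.0])
mu2 = np.array([2.0, 1.0])
sigma1 = np.array([[2.0, 0.5], [0.5, 1.0]])
sigma2 = np.array([[1.0, 0.0], [0.0, 3.0]])

t, b, c, y = minimax_procedure(mu1, sigma1, mu2, sigma2)
quad = b @ (t**2 * sigma1 - (1.0 - t)**2 * sigma2) @ b  # left side of (21)
v = ((1.0 - t) / t) ** 2
lam = roots_of_pencil(sigma1, sigma2)
print(t, y, quad, v, lam)
```

In this example v can be compared directly with the smallest and largest
entries of `lam`, which is the condition stated at the end of the section.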
APPENDIX
Numerical Example

In order to illustrate the minimax procedure developed in this report, a
numerical example is given below. The data for this example are from
Jolicoeur and Mosimann 9/.

Three characters of carapace dimensions of painted turtles
(Chrysemys picta marginata) were measured in mm.


            24 Males                      24 Females
   Length    Width    Height      Length    Width    Height
   [first three rows illegible]
     101       84       39          105       86       40
     102       85       38          109       88       44
     103       81       37          123       92       50
     104       83       39          123       95       46
     106       83       39          133       99       51
     107       82       38          133      102       51
     112       89       40          133      102       51
     113       88       40          134      100       48
     114       86       40          136      102       49
     116       90       43          137       98       51
     117       90       41          138       99       51
     117       91       41          141      105       53
     119       93       41          147      108       57
     120       89       40          149      107       55
     120       93       44          153      107       56
     121       93       42          155      115       63
     125       93       43          155      117       60
     127       96       45          158      115       62
     128       95       45          159      118       63
     131       95       46          162      124       61
     135      106       47          177      132       67

Table 1    Three characters of carapace dimensions of painted turtles.

The classification of vector random variables as coming from one of two
multivariate normal distributions leads to a linear likelihood procedure
when the two covariance matrices are equal. It is possible to find a linear
admissible procedure which will minimize the probability of
misclassification of observations from one of two multivariate normal
distributions with different mean vectors ($\mu_1$, $\mu_2$) and covariance
matrices ($\Sigma_1$, $\Sigma_2$). Two approaches have been discussed in this
report: 1) minimization of one probability of misclassification for a
specified probability of the other, 2) the minimax procedure. The former
approach minimizes the probability of misclassification when sampling from
$\pi_1$, and the latter finds a value of t, with $v = [(1 - t)/t]^2$ lying
between the maximal and minimal characteristic roots of
$|\Sigma_1 - \lambda \Sigma_2| = 0$, such that
$b'[t^2 \Sigma_1 - (1 - t)^2 \Sigma_2] b = 0$. An example, using carapace
dimensions of painted turtles, has been cited to illustrate the application
of the procedures.